MIPS Instruction Encoding

MIPS Instruction Formats

An interesting facet of people is that we are quite adept at thinking of new ways of describing things. Once we think of a new way of characterizing things, it becomes a way of describing other things. In other words, we start to see things through the lens of the new categorization system.

For years, computers were categorized by major groupings (RISC/CISC, 16-/32-bit, Intel/Motorola). Their instruction sets were characterized by functionality: arithmetic, logic, shift, data movement, flow control, and so on. And then along came MIPS. MIPS instructions are characterized by their encoding; for example, R-type or I-type instructions. Virtually all discussions of MIPS processors use the instruction format to categorize instructions.

A few years ago, such a categorization would have seemed as strange as categorizing spending by recording the serial numbers of banknotes. Apart from some special cases such as the Motorola 68000’s use of line-A and line-F instructions to categorize exception types, no one ever thought about instruction encoding because it was some form of arbitrary instruction labeling carried out by the chip designer. It was no more relevant than the chassis number on the engine block in your automobile.

MIPS changed things. Instruction encoding that was once invisible suddenly jumped into the light. Why? Because the number of bits in an instruction is fixed and the designer has to devise the bit-to-instruction mapping that maximizes the processor’s power. The MIPS instruction encoding was an inspired piece of engineering.

Suppose the MIPS designers had taken a simplistic and naïve approach to instruction set design. They could have argued something like this. There are eight types of instruction, so that’s 3 bits. Each instruction class can have up to 16 different members or variations, and that’s another 4 bits. If we have three register fields, each selecting one of 32 registers, there goes another 3 x 5 = 15 bits. So far, we’ve got through 3 + 4 + 15 = 22 bits. If we stop here, we have 32 – 22 = 10 bits left for a literal. That’s a tad disappointing.

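MIPS uses a form of Huffman encoding, that is, a code whose identifying part has a variable length. Every MIPS instruction is still 32 bits long; it is the number of bits used to identify the class of instruction that varies. For example, a Huffman code with four classes might be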

Class 0 1xxxxxxxx

Class 1 01xxxxxxx

Class 2 001xxxxxx

Class 3 000xxxxxx

As you can see, each class has a unique prefix that can be rapidly decoded, and the rest of the bits can be used to define a specific instance of an instruction. This approach allows us to use instructions with a short prefix as a means of providing longer literals.
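The prefix idea can be sketched in a few lines of Python. This illustrates only the four-class example, not the real MIPS decoder. It assumes 9-bit code words written as strings and takes class 0 to be marked by a leading 1, so that no prefix is the start of another. The names PREFIXES and classify are invented for the sketch.

```python
# Hypothetical four-class prefix code: 1, 01, 001 and 000.
# No prefix is the start of another, so a word has exactly one class.
PREFIXES = [("1", 0), ("01", 1), ("001", 2), ("000", 3)]


def classify(word):
    """Return (class number, remaining payload bits) for a 9-bit string."""
    if len(word) != 9 or set(word) - {"0", "1"}:
        raise ValueError("expected a 9-character bit string: " + repr(word))
    for prefix, class_number in PREFIXES:
        if word.startswith(prefix):
            return class_number, word[len(prefix):]
    raise AssertionError("unreachable: the prefixes cover every bit string")
```

Class 0 keeps eight payload bits while classes 2 and 3 keep only six, which is the trade-off the text describes: the shorter the prefix, the more room for a literal.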

J-type Format

The MIPS encoding system identifies three major classes: R-type, I-type, and J-type. Let’s begin with the J-type.

A J-type instruction divides the 32-bit instruction into a 6-bit op-code field and a 26-bit literal field. The “J” indicates “jump” and this class is used for jumps. The six most-significant bits of the jump instruction are 000010 and the remaining 26 bits are used to provide a 28-bit target address. The address is 28 bits rather than 26 bits because MIPS is byte addressed and all instructions fall on a …xxxx00 boundary. The actual target address is obtained by appending two 0s to the literal and then taking the four most-significant bits from the program counter (strictly, from the address of the instruction that follows the jump). The literal is not sign-extended, so a jump stays within the current 256 MB region of memory.

Note that the J-type op-code 000011 represents a jump and link instruction, jal. This is the same as a jump except that the return address is saved in the link register, $ra (register 31).
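As a minimal sketch of the J-type layout, the following Python splits an instruction word into its 6-bit op-code and 26-bit literal and forms the target address. It assumes the MIPS32 rule that the upper four bits of the target come from the address of the instruction after the jump (pc + 4). The function name decode_j_type is invented for the example.

```python
def decode_j_type(instruction, pc):
    """Split a J-type word and compute the 32-bit jump target.

    instruction: the 32-bit instruction word
    pc:          the address of the jump instruction itself
    """
    opcode = (instruction >> 26) & 0x3F      # bits 31..26
    literal = instruction & 0x03FFFFFF       # bits 25..0
    low_28_bits = literal << 2               # append two 0s
    # Assumption: the top four bits come from the following instruction.
    region = (pc + 4) & 0xF0000000
    return opcode, region | low_28_bits


# j 0x00400020 placed at address 0x00400000 assembles to 0x08100008
opcode, target = decode_j_type(0x08100008, pc=0x00400000)
print(f"{opcode:06b} {target:#010x}")   # 000010 0x00400020
```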

R-type Format

The workhorse of the MIPS instruction set is the R-format (here “R” indicates “Register”). The R-format performs the register-to-register data-processing operations of the MIPS.

In MIPS terminology, the three registers are designated rs (source), rt (source) and rd (destination), and a typical operation is addu rd,rs,rt, which performs [rd] ← [rs] + [rt].

The op-code is used in conjunction with the subfield in bits 0 to 10 to define the specific operation and to provide any parameters for classes of operation such as shifts. In a shift, for example, bits 6 to 10 hold the shift count.
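The field layout can be made concrete with a short Python sketch that packs and unpacks an R-type word. The names shamt (bits 6 to 10) and funct (bits 0 to 5) are the conventional MIPS names for the two halves of the subfield; they are not used in the text above, and the function names are invented for the example.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields (6, 5, 5, 5, 5, 6 bits) into one word."""
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)


def decode_r_type(instruction):
    """Unpack a 32-bit word into its R-type fields."""
    return {
        "opcode": (instruction >> 26) & 0x3F,
        "rs": (instruction >> 21) & 0x1F,
        "rt": (instruction >> 16) & 0x1F,
        "rd": (instruction >> 11) & 0x1F,
        "shamt": (instruction >> 6) & 0x1F,
        "funct": instruction & 0x3F,
    }


# addu $3,$1,$2 uses op-code 000000 and function code 100001
word = encode_r_type(rs=1, rt=2, rd=3, shamt=0, funct=0b100001)
print(f"{word:#010x}")   # 0x00221821
```

Because the op-code is the same for every instruction in this group, it is the 6-bit function code that actually distinguishes addu from, say, a shift.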

I-type Format

One of the most interesting formats is the I-type format where the “I” indicates immediate; that is, the instruction carries a literal field. This is interesting because, traditionally, CISC processors have regarded instructions with literals as simply a variant of the parent operation; for example, add and add literal. MIPS creates an I-type because of the encoding technique. One of the three register fields (rd) is sacrificed and concatenated with the 11-bit subfield to form a 16-bit constant field, as the following figure demonstrates.

The I-type format implements three kinds of operation. First, there are the typical register-to-register operations with a 16-bit literal, such as addiu $4,$5,0x1234, which adds the literal to register $5 and puts the result in register $4. Note that the literal is a 16-bit value.
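A minimal Python sketch of the I-type layout follows. It assumes the standard field order (6-bit op-code, 5-bit rs, 5-bit rt, 16-bit literal) and encodes the addition as the immediate instruction addiu, whose op-code is 001001. The function names are invented for the example.

```python
ADDIU = 0b001001  # op-code of add immediate unsigned


def encode_i_type(opcode, rs, rt, immediate):
    """Pack op-code, source rs, destination rt and a 16-bit literal."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def decode_i_type(instruction):
    """Return (opcode, rs, rt, 16-bit literal) from a 32-bit word."""
    return (
        (instruction >> 26) & 0x3F,
        (instruction >> 21) & 0x1F,
        (instruction >> 16) & 0x1F,
        instruction & 0xFFFF,
    )


# addiu $4,$5,0x1234: rt = 4 is the destination, rs = 5 is the source
word = encode_i_type(ADDIU, rs=5, rt=4, immediate=0x1234)
print(f"{word:#010x}")   # 0x24a41234
```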

The next class of I-type instruction is the branch, such as bne $6,$2,target. Here the branch performs a comparison between the two registers and executes a relative branch to target if they are not equal. The target is encoded as a 16-bit offset that is extended to 18 bits by adding 00 to the least-significant bits (remember all instructions are on 4-byte word boundaries). Finally, the 18-bit offset is sign-extended to 32 bits and added to the program counter, PC (strictly, the address of the instruction following the branch), to give the target address.
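The branch arithmetic can be checked with a small Python sketch. It assumes the MIPS convention that the offset is relative to the instruction after the branch (pc + 4); the helper names sign_extend and branch_target are invented for the example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(pc, offset16):
    """Compute the target of a taken branch located at address pc."""
    byte_offset = sign_extend((offset16 & 0xFFFF) << 2, 18)  # append 00
    # Assumption: the offset is added to the address after the branch.
    return (pc + 4 + byte_offset) & 0xFFFFFFFF


print(hex(branch_target(0x00400010, 0x0003)))   # 0x400020
print(hex(branch_target(0x00400010, 0xFFFF)))   # 0x400010
```

An offset of 0xFFFF is –1 words, so the second branch lands back on the branch instruction itself.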

The third member of the I-type class is the load and store instructions because these use a pointer register, a source or destination register, and a literal offset; for example, lw $t1,0x1234($at) copies the 32-bit value at memory address [$at] + 0x1234 into register $t1. Note that the offset is a 16-bit value sign-extended to 32 bits.
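The address calculation for a load can be sketched in Python as follows. The register file and memory are modelled as plain dictionaries purely for illustration, and the names effective_address and load_word are invented for the example.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base_value, offset16):
    """Add the sign-extended 16-bit offset to the pointer register."""
    return (base_value + sign_extend16(offset16)) & 0xFFFFFFFF


def load_word(registers, memory, dest, base, offset16):
    """Model lw dest,offset(base); return the address that was read."""
    address = effective_address(registers[base], offset16)
    registers[dest] = memory[address]
    return address


# lw $t1,0x1234($at) with a hypothetical pointer value in $at
registers = {"$at": 0x10010000, "$t1": 0}
memory = {0x10011234: 0xCAFEBABE}
load_word(registers, memory, "$t1", "$at", 0x1234)
print(hex(registers["$t1"]))   # 0xcafebabe
```

Because the offset is sign-extended, a value such as 0xFFFC addresses the word four bytes below the pointer, not 65,532 bytes above it.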

This list is not exhaustive. There are special instructions, such as the coprocessor instructions that perform operations on coprocessors. The coprocessor instructions have the op-code 0100xx, where the two bits xx select one of four options. The remaining 26 bits are divided into fields of 5, 5, 5, 5, and 6 bits that define coprocessor registers and pass details of the operation to be performed.
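The split described above can be sketched in Python. The sketch deliberately leaves the five fields unnamed because their meaning depends on the coprocessor and the operation; the function name split_coprocessor and the name selector for the xx bits are invented for the example.

```python
def split_coprocessor(instruction):
    """Split a coprocessor word into its xx bits and 5,5,5,5,6-bit fields."""
    opcode = (instruction >> 26) & 0x3F
    if opcode >> 2 != 0b0100:
        raise ValueError("op-code is not of the form 0100xx")
    selector = opcode & 0b11                 # the two xx bits
    fields = (
        (instruction >> 21) & 0x1F,          # bits 25..21
        (instruction >> 16) & 0x1F,          # bits 20..16
        (instruction >> 11) & 0x1F,          # bits 15..11
        (instruction >> 6) & 0x1F,           # bits 10..6
        instruction & 0x3F,                  # bits 5..0
    )
    return selector, fields


print(split_coprocessor(0x46020800))   # (1, (16, 2, 1, 0, 0))
```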